by

Christos H. Papadimitriou 
Massachusetts Institute of Technology 



Abstract 

A sequence of interleaved user transactions in a database system may not
be serializable, i.e., equivalent to some sequential execution of the
individual transactions. Using a simple transaction model we show that
recognizing the transaction histories which are serializable is an
NP-complete problem. We therefore introduce several efficiently
recognizable subclasses of the class of serializable histories; most of
these subclasses correspond to serializability principles existing in the
literature and used in practice. We also propose two new principles
which subsume all previously known ones. We give necessary and sufficient
conditions for a class of histories to be the output of an efficient
history scheduler; these conditions imply that there can be no efficient
scheduler that outputs all serializable histories, and also that all
subclasses of serializable histories studied above have an efficient
scheduler. Finally, we show how our results can be extended to far more
general transaction models, to transactions with partly interpreted
functions, and to distributed database systems.
1. INTRODUCTION

In many situations many users may consult and update a common database.
We can think of such independent user transactions as sequences of
atomic database operations, interleaved with computations that are local
to the user, that is, they do not affect or depend on the current state
of the database. It is a function of database management to handle the
update and retrieval requests made by the users in such a way that the
resulting overall process is in some appropriate sense correct. It is
generally accepted (see, for example, [SLR], [SK], [EGLT], [BGRP]) that
the right notion of correctness in this context is that of serializability.
A sequence of atomic user updates/retrievals is called serializable
essentially if its overall effect is as though the users took turns, in
some order, each executing their entire transaction indivisibly. The
simplest example of a non-serializable sequence is a primitive form of a
"race". Imagine two users that increment a counter by first sensing its
value, and later registering an increased one. If both users retrieve
the value of the counter before either of them has updated it, the
resulting execution sequence (or history) is not serializable. This is
because both possible serial executions of these transactions would have
resulted in a larger total increment. Naturally, much subtler examples
exist.
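
The counter race can be replayed in a few lines. The following is only an
illustrative sketch: the function name run_counter and the user labels "A"
and "B" are assumptions made for the example, and it fixes one particular
interpretation (incrementing by one), whereas the notion of serializability
studied below quantifies over all interpretations.

```python
def run_counter(schedule, start=0):
    """Replay (user, operation) steps against one shared counter."""
    counter = start
    sensed = {}  # the value each user most recently read
    for user, op in schedule:
        if op == "read":
            sensed[user] = counter
        elif op == "write":
            # Register the value sensed earlier, increased by one.
            counter = sensed[user] + 1
        else:
            raise ValueError(f"unknown operation: {op}")
    return counter


# Both users read before either writes: the race described in the text.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
# The two possible serial executions.
serial_ab = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
serial_ba = [("B", "read"), ("B", "write"), ("A", "read"), ("A", "write")]

print(run_counter(interleaved), run_counter(serial_ab), run_counter(serial_ba))
```

The interleaved schedule leaves the counter at 1, while either serial order
leaves it at 2, which is the "larger total increment" referred to above.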

The appeal of serializability as a correctness criterion is quite
easy to justify. Databases are supposed to be faithful models of parts
of the world, and user transactions represent instantaneous changes in
the world. Since such changes are totally ordered by temporal priority,
the only acceptable interleavings of atomic steps of different
transactions are those that are equivalent to some sequential execution of
these transactions. Another way of viewing serializability is as a tool
for ensuring system correctness. If each user transaction is correct
(i.e., when run by itself, it is guaranteed to map consistent states of
the database to consistent states) and transactions are guaranteed to be
intermingled in a serializable way, then the overall system is also correct.
In this paper we consider transactions that consist of two atomic
actions: a retrieval of the values of a set of database entities (called
the read-set of the transaction) followed by an update of the values of
another set of entities (the write-set). This is exactly the kind of
transactions handled by the system SDD-1 [BGRP], [RG]. However, the
main reason for considering this model here is that it provides a nice
framework for understanding and comparing very different philosophies of
serializability that already exist in the literature, e.g., [BS], [SLR],
[EGLT], [BGRP]. Despite its apparent simplicity, it yields a theory of
serializability that is rich in combinatorial intricacies, and raises
interesting complexity questions. Since our model is the most general
common restriction of the models in the various references cited above,
our negative results apply verbatim to those models. Furthermore, most
of our positive results and characterizations are also easily generalizable
to more general situations, although their proofs (in many cases their
very statements) would be extremely cumbersome. Hence, we view our model
as a convenient language, of the right degree of conceptual complexity,
for developing and communicating our ideas about serializability, rather
than a set of restrictions that enable the proofs of certain theorems.
We formalize our model of transactions in Section 2, where some
preliminary results are also proved.

In Section 3 we prove that the question of whether a given sequence
of read and write operations corresponding to several transactions (called
a history) is serializable is NP-complete [AHU], [Ka]. This suggests
that, most probably, there is no efficient algorithm that distinguishes
between serializable and non-serializable histories.

In Section 4, we study some efficiently recognizable subsets of the
set of serializable histories. In other words, we present polynomial-time
"heuristics" that approximate the NP-complete predicate of serializability,
in a manner quite reminiscent of efficient approximations of NP-complete
optimization problems [GJ], [PS]. We show that the two-phase locking
strategy [EGLT] and the protocol P3 of [BGRP] are incommensurate special
cases of two more general classes called Q and DSR; the latter is
related to the model of [SLR]. These two serializability principles
are therefore very general (and applicable) new serialization methods. We
also introduce the class SSR of histories that can be serialized without
reversing the order of temporally non-overlapping transactions; it is not
known whether this class is efficiently recognizable. In Section 5, we
observe that the quite intricate interrelations among these interesting
classes are simplified considerably if some "static" restrictions are
imposed on the read- and write-sets. We point out there that the simple
serializability theory of [SLR] is due to such a restriction of their model.

For all efficiently recognizable classes of histories studied in
Sections 4 and 5 there is also an efficient scheduler: an algorithm, that
is, which takes any history and transforms it to its closest (according
to some appropriate metric) history within the class considered. In
Section 6 we show that this is no accident: a class of histories has
an efficient scheduler if and only if it is efficiently recognizable,
plus a regularity condition, namely that its set of prefixes is also
efficiently recognizable. By this result, the complexity theory developed
in Sections 3 through 5 is practically relevant, because the practical
question of the existence of an efficient scheduler for a given class
of histories is explicitly linked to the complexity properties of the
class. Another implication is the negative result that, unless P = NP,
there is no efficient "serializer" of histories, and hence considering
efficient but more restrictive schedulers, such as the ones discussed
above, is a reasonable alternative. Finally, Section 7 concludes our
treatment of the subject. We discuss there a number of possible
extensions of our results, such as to general (multi-step) transactions and
distributed databases.
2. DEFINITIONS-NOTATION

A history is a quadruple h = (n,π,V,S), where n is a positive
integer; π is a permutation of the set Σ_n = {R_1,W_1,R_2,W_2,...,R_n,W_n},
that is, a one-to-one function π: Σ_n → {1,2,...,2n}, such that
π(R_i) < π(W_i) for i = 1,2,...,n (a permutation π is represented by
<π^{-1}(1),π^{-1}(2),...,π^{-1}(2n)>); finally, S is a function mapping
Σ_n to 2^V, where V is a finite set of variables. Each pair (R_i,W_i)
will be called a transaction T_i. S(R_i) will be called the read set of
T_i and S(W_i) its write set. We shall represent histories in a compact
way by exhibiting π, with the sets S(·) given in brackets following each
element of Σ_n. For example, the history
h = (3,<R_1,R_2,W_1,R_3,W_2,W_3>,{x,y},S), where S(R_1) = S(R_3) = {x},
S(R_2) = ∅, S(W_3) = {y}, and S(W_1) = S(W_2) = {x,y}, is represented as

h = R_1[x] R_2 W_1[x,y] R_3[x] W_2[x,y] W_3[y].

The set of all histories is denoted by H.
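
To make the compact notation concrete, the following sketch parses it into
a list of steps and checks the defining condition that each R_i precedes its
W_i. The names Step, parse_history and is_valid_history, and the ASCII
spelling "R1[x]" for R_1[x], are assumptions introduced only for this
illustration; they are not part of the formal model.

```python
import re
from typing import FrozenSet, List, NamedTuple


class Step(NamedTuple):
    kind: str              # "R" or "W"
    txn: int               # index i of the transaction T_i
    vars: FrozenSet[str]   # S(R_i) or S(W_i)


def parse_history(text: str) -> List[Step]:
    """Parse e.g. 'R1[x]R2W1[x,y]' into steps, in the order given by pi."""
    steps = []
    pattern = r"([RW])(\d+)(?:\[([^\]]*)\])?"
    for m in re.finditer(pattern, text.replace(" ", "")):
        names = m.group(3) or ""          # a missing bracket means the empty set
        variables = frozenset(v for v in names.split(",") if v)
        steps.append(Step(m.group(1), int(m.group(2)), variables))
    return steps


def is_valid_history(steps: List[Step]) -> bool:
    """Each transaction has exactly one R and one W, with R before W."""
    seen_read, seen_write = set(), set()
    for s in steps:
        if s.kind == "R":
            if s.txn in seen_read or s.txn in seen_write:
                return False
            seen_read.add(s.txn)
        else:
            if s.txn not in seen_read or s.txn in seen_write:
                return False
            seen_write.add(s.txn)
    return seen_read == seen_write


# The example history of the text.
h = parse_history("R1[x]R2W1[x,y]R3[x]W2[x,y]W3[y]")
print(is_valid_history(h), [f"{s.kind}{s.txn}" for s in h])
```

The position of a step in the returned list, counted from 1, plays the role
of π applied to that symbol.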

We can think of each transaction T_i as starting with an instantaneous
reading of the values in the variables in S(R_i), performing a possibly
lengthy local computation and then instantaneously recording the results
in a different set S(W_i) of variables. We do not look into the details
of the exact nature of the local computation. In fact, we view each
transaction T_i as a set of |S(W_i)| uninterpreted |S(R_i)|-ary function
symbols {f_ij : j = 1,...,|S(W_i)|}. π is the sequence in which these
atomic read and write operations take place. Thus, a history can be viewed
as a special case of a fork-join parallel program schema, in which the
local computations involve a number of local temporary variables t_ij and
are executed in parallel with other read-write operations (see Figure 1).

Figure 1. The history h = R_1[x] R_2 W_1[x,y] R_3[x] W_2[x,y] W_3[y]
viewed as a parallel program schema.

The concatenation of two histories h_1 = (n,π,V,S), h_2 = (m,ρ,V,T)
is a history h_1 ∘ h_2 = (n+m,τ,V,P), where P(W_i) = S(W_i) if i ≤ n, and
P(W_i) = T(W_{i-n}) for i > n. Similarly, P(R_i) = S(R_i) if i ≤ n, and
P(R_i) = T(R_{i-n}) for i > n. Also τ(W_i) = π(W_i) if i ≤ n, and
τ(W_i) = ρ(W_{i-n}) + 2n for i > n; τ(R_i) = π(R_i) for i ≤ n, and
τ(R_i) = ρ(R_{i-n}) + 2n for i > n. In other words h_1 ∘ h_2 is a
juxtaposition of the two histories, only with the transactions of h_2
renamed. Thus, if

h_1 = R_1[x] R_2[y] W_2[y] R_3 W_1[z] W_3[y]

and

h_2 = R_1[x,y] R_2[x] W_1[y] W_2[z]

then

h_1 ∘ h_2 = R_1[x] R_2[y] W_2[y] R_3 W_1[z] W_3[y] R_4[x,y] R_5[x] W_4[y] W_5[z].



We say that two histories h_1 = (n,π,V,S) and h_2 = (n,π',V,S) are
equivalent (written h_1 ≡ h_2) if and only if the corresponding schemata are
(strongly) equivalent. In other words, given any set of |V| domains for
the variables, any set of initial values for the variables from the
corresponding domains, and, furthermore, any interpretation of the functions
f_ij, the values of the variables are identical after the execution of both
histories. Notice that our definition of equivalence requires that the two
histories involve the same set of transactions. Thus
h_1 = R_1[y] R_2 W_2[x] W_1[x] is not equivalent to h_2 = R_1[y] W_1[x],
despite the fact that their corresponding schemata are equivalent
(essentially because T_2 is "dead" in h_1). This is a matter of
convenience, and little change to our derivations would be necessary in
order to broaden equivalence in this sense.

To give a syntactic characterization of equivalence, it is necessary
to first introduce some terminology. Let h = (n,π,V,S) be a history.
The augmented version of h is the history h̄ = (n+2,π̄,V,S̄), where
π̄ = <R_{n+1},W_{n+1},π,R_{n+2},W_{n+2}> and S̄(R_i) = S(R_i),
S̄(W_i) = S(W_i) for i ≤ n, and also S̄(R_{n+1}) = S̄(W_{n+2}) = ∅,
S̄(W_{n+1}) = S̄(R_{n+2}) = V. In other words, h̄ is h preceded by a
transaction that initializes all variables without sensing any, and
followed by a transaction that reads the final values of all the
variables, without changing them. Suppose that x ∈ S(R_i). We say that
R_i reads x from W_j in h if W_j is the latest occurrence of a write
symbol before R_i in h̄ such that x ∈ S̄(W_j). Notice that since h̄
contains W_{n+1} with S̄(W_{n+1}) = V, such a write symbol always exists.
The definition of a live transaction in h is as follows:

a. T_{n+2} is live in h.

b. If for some live transaction T_i, R_i reads a variable from W_j
in h, then T_j is also live in h.

c. The only live transactions in h are those defined by (a)
and (b) above.

The following is now a simple syntactic characterization of history
equivalence, essentially a restatement of the characterization of schema
equivalence in terms of Herbrand interpretations [LPP]:

PROPOSITION 1. Two histories h_1 = (n,π,V,S) and h_2 = (n,π',V,S)
are equivalent if and only if they have the same sets of live transactions,
and a live R_i reads x from W_j in h_1 if and only if R_i reads x
from W_j in h_2. □

One of the implications of Proposition 1 is that equivalence of
histories can be decided efficiently. The sets of live transactions can
be found in O(n·|V|) time by applying the recursive definition given
above, and so can the reads-from relation for transactions. Hence we have:

COROLLARY. Equivalence of histories can be decided in O(n·|V|)
time. □
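
The test suggested by Proposition 1 can be sketched as follows. This is an
illustration under stated assumptions rather than a definitive
implementation: a history is taken to be a list of (kind, transaction,
variable set) triples in the order given by π, the names INIT and FINAL
stand for the augmenting transactions T_{n+1} and T_{n+2}, and both
histories are assumed to involve the same transactions and only variables
from the given set.

```python
INIT, FINAL = "T_init", "T_final"  # stand-ins for T_{n+1} and T_{n+2}


def reads_from(history, variables):
    """Return {(reader, x): writer} computed on the augmented history."""
    last_writer = {x: INIT for x in variables}   # T_{n+1} writes everything
    rf = {}
    for kind, txn, vs in history:
        for x in vs:
            if kind == "R":
                rf[(txn, x)] = last_writer[x]
            else:
                last_writer[x] = txn
    for x in variables:                          # T_{n+2} reads everything
        rf[(FINAL, x)] = last_writer[x]
    return rf


def live_transactions(rf):
    """Close {FINAL} under 'a live reader makes its writers live'."""
    sources = {}
    for (reader, _x), writer in rf.items():
        sources.setdefault(reader, set()).add(writer)
    live, stack = {FINAL}, [FINAL]
    while stack:
        for writer in sources.get(stack.pop(), ()):
            if writer not in live:
                live.add(writer)
                stack.append(writer)
    return live


def equivalent(h1, h2, variables):
    """Proposition 1: same live sets and same reads-from for live readers."""
    rf1, rf2 = reads_from(h1, variables), reads_from(h2, variables)
    live1, live2 = live_transactions(rf1), live_transactions(rf2)
    if live1 != live2:
        return False
    restrict = lambda rf, live: {k: w for k, w in rf.items() if k[0] in live}
    return restrict(rf1, live1) == restrict(rf2, live2)


# The counter race of the Introduction and its two serial executions.
race = [("R", 1, {"x"}), ("R", 2, {"x"}), ("W", 1, {"x"}), ("W", 2, {"x"})]
s12 = [("R", 1, {"x"}), ("W", 1, {"x"}), ("R", 2, {"x"}), ("W", 2, {"x"})]
s21 = [("R", 2, {"x"}), ("W", 2, {"x"}), ("R", 1, {"x"}), ("W", 1, {"x"})]
print(equivalent(race, s12, {"x"}), equivalent(race, s21, {"x"}))
```

On the race history T_1 is dead (nothing reads what it wrote), whereas it is
live in both serial orders, so the race is equivalent to neither.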

The main theme of this paper is the notion of serializability. A
history h = (n,π,V,S) is serial if π(W_i) = π(R_i) + 1 for all i = 1,2,...,n;
in other words, a history is serial if R_i immediately precedes W_i in it
for i = 1,...,n. A history h is serializable (notation: h ∈ SR) if and
only if there is a serial history h_s such that h ≡ h_s. In the next
section we shall present a syntactic characterization of serializable
histories analogous to (and based on) Proposition 1.
3. THE COMPLEXITY OF SERIALIZABILITY

In order to examine the complexity of the serializability problem, 
we need first to introduce some graph-theoretic terminology. 

DEFINITION 1. A polygraph* P = (N,A,B) is a digraph (N,A) together
with a set B of bipaths; that is, pairs of arcs, not necessarily
in A, of the form ((v,u),(u,w)) such that (w,v) ∈ A. □

Alternatively, a polygraph (N,A,B) can be viewed as a family 𝒫(N,A,B)
of digraphs. A digraph (N,A') is in 𝒫(N,A,B) if and only if A ⊆ A',
and for each bipath (a_1,a_2) ∈ B, A' contains at least one of a_1, a_2.
Polygraphs will be represented schematically as in Figure 2a. Arcs in A
will be drawn as ordinary arrows, and pairs of arcs in B will be marked
by a circular arc centered on their common node.

DEFINITION 2. A polygraph (N,A,B) is acyclic if there is an
acyclic digraph in 𝒫(N,A,B). □

For example, the digraph of Figure 2b is both in 𝒫(N,A,B) and
acyclic; it follows that (N,A,B) of Figure 2a is acyclic. Notice that
for a polygraph (N,A,B) to be acyclic, the digraph (N,A) must
definitely be acyclic.
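
Definition 2 can be checked directly, if inefficiently, by trying every way
of choosing one arc from each bipath. The sketch below is an illustration
only; the function names and the small polygraphs used at the end are
assumptions made for the example. Choosing exactly one arc per bipath
suffices, since adding further arcs can never remove a cycle. The search
takes time exponential in |B|, which is in keeping with the hardness result
proved later in this section.

```python
from itertools import product


def is_acyclic(nodes, arcs):
    """Kahn's algorithm: a digraph is acyclic iff every node can be removed."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in set(arcs):
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)


def polygraph_is_acyclic(nodes, arcs, bipaths):
    """Try each choice of one arc per bipath (2**len(bipaths) digraphs)."""
    for choice in product((0, 1), repeat=len(bipaths)):
        chosen = {bipath[c] for bipath, c in zip(bipaths, choice)}
        if is_acyclic(nodes, set(arcs) | chosen):
            return True
    return False


nodes = ["u", "v", "w"]
# One bipath ((v,u),(u,w)); Definition 1 requires the arc (w,v) to be in A.
bipaths = [(("v", "u"), ("u", "w"))]
print(polygraph_is_acyclic(nodes, [("w", "v")], bipaths))
# Extra arcs (u,v) and (w,u) make both choices close a cycle.
print(polygraph_is_acyclic(nodes, [("w", "v"), ("u", "v"), ("w", "u")], bipaths))
```

In the second polygraph the digraph (N,A) is itself acyclic, so it also
shows that acyclicity of (N,A) is necessary but not sufficient.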

Given any history h = (n,π,V,S) we are going to define a polygraph
P(h) = (N,A,B). N is the set of live transactions of h̄, the augmented
version of h. First, A contains the arcs {(T_{n+1},v) : v ∈ N−{T_{n+1}}},
and also the arcs {(v,T_{n+2}) : v ∈ N−{T_{n+2}}}. Secondly, whenever
transaction u reads some variable x from v in h, we add the arc (v,u) in A.
Furthermore, if for a third transaction w, x is in the write-set of w, then
we add the bipath ((u,w),(w,v)) in B. This concludes the construction
of P(h).

* We insist on this terminology only because it has already become
notorious for its impropriety.

Figure 2.

Figure 3.

Intuitively, P(h) captures a partial order that can be interpreted
as "happened before", and with which any history that is equivalent to h
must be consistent. Each arc (v,u) means that u read some variable
from v and hence must follow it. Also, a bipath ((u,w),(w,v)) means that
w writes on the same variable, and hence cannot be in between v and u;
it must either precede v or follow u. This is stated as a lemma.

LEMMA 1. Two histories h_1 = (n,π,V,S) and h_2 = (n,π',V,S) are
equivalent if and only if P(h_1) and P(h_2) are identical.

Proof. Both directions follow from Proposition 1 and the definition
of P(h). □

LEMMA 2. A history h = (n,π,V,S) without dead transactions is
serializable if and only if P(h) is acyclic.



Proof. If h is serializable, there exists a serial history h_s
such that h ≡ h_s or, by Lemma 1, P(h) = P(h_s). However P(h_s) = (N,A,B)
is acyclic. To see this, let (T_1,...,T_n) be ordered according to their
occurrence in h_s. We construct a digraph (N,A') ∈ 𝒫(P(h_s)) as follows:
A' contains the arcs in A, and for each bipath ((T_i,T_j),(T_j,T_k)) in
B we add to A' the arc (T_i,T_j) if i < j, or (T_j,T_k) if j < k. To
show that exactly one of these must occur, recall that in h_s T_i reads
a variable x ∈ S(W_j) from T_k, and hence k < i, and not k < j < i.
Consequently the above construction yields a digraph (N,A') in
𝒫(N,A,B). Next, notice that (N,A') is acyclic since it is a subgraph
of the total order (T_{n+1},T_1,...,T_n,T_{n+2}). So, P(h) is also acyclic.

Now, let (N,A') be an acyclic digraph in 𝒫(P(h)). The serial
history h_s resulting from topologically sorting (N,A') is then
equivalent to h. This follows from Proposition 1 and from the fact that
since one of the two arcs of each bipath in B is in A', all transactions
read each variable from the same transaction in h_s as they do in h. □

Unfortunately, the combinatorial characterization of serial
reproducibility shown in Lemma 2 does not directly suggest an efficient test.
In fact, the theorem below is strong evidence that no such test exists.

THEOREM 1. Testing whether a history h is serializable is
NP-complete, even if h has no dead transactions.

In order to proceed with the proof of Theorem 1 we first need another
lemma. It is well known (see [AHU], [Ka]) that the satisfiability problem
of Boolean formulas in conjunctive normal form with two or three literals
in each clause (abbreviated SAT) is NP-complete. We can show that a more
restricted version of this problem is still NP-complete. Call a clause
mixed if it contains both variables and negations of variables, and call a
formula noncircular if at most one of the occurrences of each variable is
in a mixed clause.

LEMMA 3. SAT is NP-complete even if the formulae are restricted
to be noncircular.

Proof. Consider any instance F of SAT and a variable x in it.
Let m be the number of occurrences of x in the formula F, and let
x_1,x_2,...,x_m be new variables. We replace x in its first occurrence
by x_1, in its second by x_2, in its third by x_3, etc. Finally, we add
the clauses (x_1 ∨ ¬x_2) ∧ (¬x_1 ∨ x_2) ∧ (x_2 ∨ ¬x_3) ∧ (¬x_2 ∨ x_3) ∧ ...,
which is the conjunctive normal form of x_1 ≡ x_2 ≡ x_3 ≡ x_4 ≡ ....
Repeating this for all variables, we observe that the resulting formula is
trivially noncircular, and the construction requires only a polynomial
amount of time. □

Proof of Theorem 1. The set of SR histories is definitely in NP,
since to show that h is SR, one only needs to construct a serial history
h_s (of length not greater than that of h), and check by Proposition 1 that
h and h_s are equivalent.

We will next show that a known NP-complete problem, the noncircular
SAT problem of Lemma 3 above, reduces to SR-testing in polynomial time.

Given any such formula F, we are going to construct a polygraph
P_F = (N,A,B) such that P_F is acyclic if and only if F is satisfiable.
We will then show that P_F can be considered as P(h) for a suitable
history h, without dead transactions. In view of Lemma 2, this will
conclude the proof.

We start from the construction of P_F = (N,A,B). F has m clauses
C_1,...,C_m and involves n Boolean variables x_1,...,x_n. Each clause C_i
consists of three literals λ_i1 ∨ λ_i2 ∨ λ_i3, where λ_ik is either a
variable or a negation of one. N contains the nodes a_j, b_j, c_j for each
variable x_j, and y_ik, z_ik, k = 1,...,m_i for each clause C_i with m_i
literals. For each variable x_j we add the arc (a_j,b_j) to A, and the
bipath ((b_j,c_j),(c_j,a_j)) to B. For each clause C_i, we add the arcs
(y_ik,z_{i,k+1}) (addition mod m_i) to A. Finally, if λ_ik = x_j, we add
the arcs (c_j,y_ik) and (b_j,z_ik) to A, and the bipath
((z_ik,y_ik),(y_ik,b_j)) to B. If λ_ik = ¬x_j, then we add the arcs
(z_ik,c_j) and (y_ik,a_j) to A, and the bipath ((a_j,z_ik),(z_ik,y_ik))
to B. For example, if the literal λ_ik is x_j, the subpolygraph of
Figure 3 will appear in P_F.

Finally, we add to N the nodes n_0, n_c and n_f, together with the
arcs (n_0,n), (n,n_c) and (n,n_f) for all n ∈ N−{n_0,n_c,n_f}, and also
the arc (n_c,n_f). This concludes the construction of P_F. In Figure 4a
we illustrate the construction for the Boolean formula

F = (X V X 2 ) A (X x V X 2 V Xg) A (X 2 VXj). 

For simplicity, in Figure 4 we have omitted the nodes n_0 and n_f.

We will now argue that P_F is acyclic if and only if F is
satisfiable. Suppose that P_F is acyclic. This means that there is an acyclic
digraph (N,A') ∈ 𝒫(P_F). For each j, exactly one of the arcs
(b_j,c_j) and (c_j,a_j) is in A': at least one by the definition of
𝒫(P_F), and not both, since together with (a_j,b_j) they would close a
cycle. Interpret the fact that (c_j,a_j) ∈ A' as meaning that x_j is
assigned the value true. We may immediately note that if a literal λ_ik
is given the value false by this assignment, the corresponding arc
(z_ik,y_ik) is also in A', since otherwise a cycle of the form
(c_j,y_ik,b_j), or (z_ik,c_j,a_j) if λ_ik = ¬x_j, would exist in (N,A').
Hence, the only way for (N,A') not to have a cycle of the form
(z_i1,y_i1,z_i2,...,y_{i,m_i}) is that at least one literal in each clause
is assigned the value true, which means that F is satisfiable.

Conversely, suppose that F is satisfied by some truth assignment
T. We will construct an acyclic digraph (N,A') ∈ 𝒫(P_F). A' contains
all of A and the arcs (c_j,a_j) if T(x_j) = true, (b_j,c_j) if
T(x_j) = false, and the arcs (z_ik,y_ik) if T(λ_ik) = false, (y_ik,b_j) if
λ_ik = x_j and T(x_j) = true, and (a_j,z_ik) if λ_ik = ¬x_j and
T(x_j) = false. Obviously, (N,A') is in 𝒫(P_F); the claim is that it is
acyclic. We first note that since F is by hypothesis noncircular, (N,A) is
acyclic. This is because by the construction of A, the clauses containing
variables only or negations only correspond to node sets with only
incoming or, respectively, only outgoing arcs; node sets corresponding to
mixed clauses have both incoming and outgoing arcs, but no two such node
sets are reachable from each other in (N,A), by F's noncircularity; it
follows that (N,A) is indeed acyclic. It is easy to check that the arcs
in A'−A can harm the digraph's acyclicity only by introducing a
(z_i1,y_i1,...,y_{i,m_i}) cycle; however, this would mean that some clause
has no true (under T) literal, and hence T does not satisfy F, a
contradiction. In Figure 4 we show in broken lines the arcs of an acyclic
digraph in 𝒫(P_F); this digraph corresponds to the truth assignment
T(x_1) = true, T(x_2) = false, T(x_3) = false, which satisfies F.

In order to conclude the proof we need to construct a history h
such that P(h) = P_F. All nodes of P_F correspond to distinct transactions.
To construct the read and write sets of the transactions (except for
n_0, n_c and n_f), we start by having all read sets empty, and a variable
x_v in the write set of each transaction v. For each arc (v,u) ∈ A we add
a variable x_vu to the write set of v and the read set of u, and for each
bipath ((v,u),(u,w)) ∈ B we add x_wv to the write set of u. Finally,
R(n_0) = ∅, W(n_0) = {x_v : v ∈ N} ∪ {x_uv : (u,v) ∈ A} = R(n_f),
R(n_c) = {x_v : v ∈ N}, W(n_f) = ∅, W(n_c) = {x_uv : (u,v) ∈ A}. In order
to sketch the construction of h, we represent the read and write
operations corresponding to the node v of P_F by R(v), W(v) respectively.
We use v to stand for R(v)W(v).
We start the construction of h from left to right. First, for each clause 
C. consisting of just negations we add the sub history h(C.) » y.,. ..y. . 
Next, for each variable x. that appears unnegated in the mixed clause 

c i (1 * V« " x j ) we add the subhi8tor y h(x i ) " R(a j )z i m c j w(a j )R(b j ) yjiR w(b j ) 

The z. part appears only if C. is purely negated and X._ - x. . Further, 

if X - x. for some purely unnegated clause C then y_ appears also 
pqj r '° p pq 

after y^ . Then follow subhlstories corresponding to the remaining 
variables. If x. does not appear unnegated in a mixed clause, then we 
add to h the subhistory h(x.) - R(a,)z la c.W(a.)R(b,)y jlR W(b.). Again, 

y eR •PP €ara on ^y **. \o " x a for soa€ P u '«ly unnegated clause C^, and if 

x. also appears in a purely negated clause C (X - x.) then as comes 
j r '° p pq J pq 

after z.. Finally, we have h(C.) « z.....z. for each purely negated 
clause C., and at the end the transaction n . 

To argue that P_F = P(h), first note that all (y_ik,z_{i,k+1}) (mod m_i)
arcs are realized by h, and that the subpolygraph of Figure 3 is realized
for each x_j = λ_ik, and the symmetric subpolygraph for ¬x_j = λ_ik.
Furthermore, it is quite easy to check that no other arcs and bipaths are
added by the construction. Hence P_F = P(h), which completes the proof of
Theorem 1. □
4. EFFICIENTLY RECOGNIZABLE CLASSES OF SERIALIZABLE HISTORIES

Given that SR is NP-complete, it is reasonable to look for subsets of 
SR that are efficiently recognizable. In this section we study several 
such classes of serializable histories. 

4.1 The Class DSR 

DEFINITION 3. Let h_1 = (n,π,V,S) and h_2 = (n,π',V,S) be histories.
We write h_1 ~ h_2 whenever π(σ) = π'(σ) for all σ ∈ Σ_n except for
two elements σ_1, σ_2 ∈ Σ_n with π(σ_1) = π'(σ_2) = k and
π(σ_2) = π'(σ_1) = k+1 for some 1 ≤ k ≤ 2n−1, and either

a. σ_1 = R_i, σ_2 = R_j for some i, j ≤ n, or

b. σ_1 = R_i, σ_2 = W_j, i ≠ j, i, j ≤ n, and S(R_i) ∩ S(W_j) = ∅, or

c. σ_1 = W_i, σ_2 = W_j, i, j ≤ n, and S(W_i) ∩ S(W_j) = ∅. □

As an illustration, we have that

R_1[x] R_2[x] W_1[x] W_2[y] ~ R_1[x] R_2[x] W_2[y] W_1[x] ~
R_2[x] R_1[x] W_2[y] W_1[x] ~ R_2[x] W_2[y] R_1[x] W_1[x],

because at each step the next history is obtained from the previous one by
switching two adjacent symbols obeying one of the conditions (a), (b) and
(c) of Definition 3 above.

The following is a direct consequence of Proposition 1 and the above
definition:

PROPOSITION 3. If h_1 ~ h_2 then h_1 ≡ h_2.

Let ~* be the reflexive-transitive closure of ~. Since ~ is
symmetric, ~* is an equivalence relation which is, by Proposition 3, a
restriction of ≡. We can show that ~* is a proper restriction of ≡
by observing that for the two histories



and 



we have 



but 



h x - R 3 WR 1 W i |x]R 2 [y}W 2 W 3 [y] 



h 2 - R 2 [y]R 3 Ww 2 w 3 [y3R 1 w 1 [x3 



h l - h 2 ' 



h > h : 



2 * 



We say that the history h is D-serializable (DSR) if there is a serial
history h_s such that h ~* h_s. Obviously, if a history is DSR, it is
certainly SR.

We can associate with a history h = (n,π,V,S) a digraph D(h)
defined as follows: The nodes of D(h) are the transactions {T_1,...,T_n}
of h, and the pair (T_i,T_j) is an arc of D(h) if and only if either

a. S(R_i) ∩ S(W_j) ≠ ∅ and π(R_i) < π(W_j), or

b. S(W_i) ∩ S(R_j) ≠ ∅ and π(W_i) < π(R_j), or

c. S(W_i) ∩ S(W_j) ≠ ∅ and π(W_i) < π(W_j).

LEMMA 4. Suppose that for two histories h_1 = (n,π,V,S) and
h_2 = (n,π',V,S), D(h_1) and D(h_2) have no cycles of length 2. Then
h_1 ~* h_2 if and only if D(h_1) = D(h_2).

Proof. It should be obvious from the definition of D(h) and the
~ relation that whenever h ~ h', also D(h) = D(h'). Consequently,
h_1 ~* h_2 implies D(h_1) = D(h_2).

For the other direction, assume that D(h_1) = D(h_2). We shall
transform h_2 to h_1 by a sequence of ~ transformations as follows:
Take the symbol in Σ_n that is the first symbol in h_1 (i.e., π^{-1}(1))
and bring it to the first place of h_2 by successively switching it with all
symbols preceding it in h_2; then take π^{-1}(2) and bring it to the
second position by switching it with all symbols preceding it, except
π^{-1}(1); and so on, until h_2 is transformed to h_1. It remains to show
that all these switchings have been legal ~ transformations. Suppose
that at some time we had to switch σ_1 with σ_2 in a manner not
allowed by Definition 3; that is either

a. σ_1 = R_i, σ_2 = W_i; this means, however, that in h_1, W_i precedes
R_i, and hence h_1 is not a history.

b. σ_1 = R_i, σ_2 = W_j and S(R_i) ∩ S(W_j) ≠ ∅. This would mean,
however, that (T_i,T_j) is in D(h_2) and (T_j,T_i) is in D(h_1). Since
D(h_1) and D(h_2) have no cycles of length 2 we can conclude that
D(h_1) ≠ D(h_2).

c. Similarly for σ_1 = W_i, σ_2 = W_j and S(W_i) ∩ S(W_j) ≠ ∅. □

We can now prove the following Theorem.

THEOREM 2. A history h = (n,π,V,S) is DSR if and only if D(h)
is acyclic.

Proof. Suppose that D(h) is acyclic. We can thus sort
topologically the set {T_1,...,T_n} of nodes of D(h). Think of this
order as a serial history h_s. It is immediate that D(h_s) = D(h), and
hence, by Lemma 4, h ~* h_s. It follows that h is DSR.

For the other direction, assume that h is DSR. We have two cases.

a. D(h) has a cycle (T_i,T_j,T_i) of length 2. This means that
π(R_i) < π(W_j) < π(W_i), and S(R_i) ∩ S(W_j) ≠ ∅,
S(W_i) ∩ (S(W_j) ∪ S(R_j)) ≠ ∅. It is easy to show that in all histories
h' for which h ~* h' we will also have π'(R_i) < π'(W_j) < π'(W_i), as
otherwise h ≢ h', and hence h ≁* h', by Proposition 3. Hence there is no
serial history h_s such that h ~* h_s, a contradiction.

b. D(h) has no cycles of length 2. By Lemma 4, there is a serial
history h_s such that D(h) = D(h_s). However, serial histories h_s
have acyclic D(h_s), and hence D(h) is acyclic. □

Theorem 2 suggests that histories that are DSR can be detected
efficiently by checking D(h) for acyclicity:

COROLLARY 1. Checking whether a history h = (n,π,V,S) is DSR can
be done in O(|V|n^2) time. □
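
The test of Corollary 1 can be sketched as follows. The representation of a
history as a list of (kind, transaction, variable set) triples in the order
given by π, and the function names, are assumptions of this illustration.
The definition of D(h) does not say so explicitly, but the sketch takes the
arcs to join distinct transactions only.

```python
def conflict_digraph(history):
    """Arcs (i, j) of D(h): a step of T_i precedes a step of T_j, at least
    one of the two is a write, and their variable sets intersect."""
    arcs = set()
    for p, (kind_p, i, vars_p) in enumerate(history):
        for kind_q, j, vars_q in history[p + 1:]:
            if i != j and "W" in (kind_p, kind_q) and vars_p & vars_q:
                arcs.add((i, j))
    return arcs


def is_acyclic(nodes, arcs):
    """Kahn's algorithm: acyclic iff every node can be peeled off."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(indegree)


def is_dsr(history):
    """Theorem 2: h is DSR if and only if D(h) is acyclic."""
    nodes = {txn for _kind, txn, _vars in history}
    return is_acyclic(nodes, conflict_digraph(history))


# The counter race: R_1[x] R_2[x] W_1[x] W_2[x].
race = [("R", 1, {"x"}), ("R", 2, {"x"}), ("W", 1, {"x"}), ("W", 2, {"x"})]
# Two transactions on disjoint variables, interleaved.
independent = [("R", 1, {"x"}), ("R", 2, {"y"}), ("W", 1, {"x"}), ("W", 2, {"y"})]
print(is_dsr(race), is_dsr(independent))
```

For the race, D(h) contains both (T_1,T_2) and (T_2,T_1), a cycle of
length 2, so the history is not DSR; the second history has an empty D(h).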

Also, we can rephrase Theorem 2 as follows (compare with
Definition 4 below):

COROLLARY 2. A history h = (n,π,V,S) is DSR if and only if we can
find real numbers S_1,...,S_n such that

a. If S(W_i) ∩ S(R_j) ≠ ∅ and π(W_i) < π(R_j) then S_i < S_j.

b. If S(R_i) ∩ S(W_j) ≠ ∅ and π(R_i) < π(W_j) then S_i < S_j.

c. If S(W_i) ∩ S(W_j) ≠ ∅ and π(W_i) < π(W_j) then S_i < S_j. □
4.2 The Class Q



DEFINITION 4. A history h = (n,π,V,S) is in Q if there exist
non-integer, distinct real numbers S_1,S_2,...,S_n with the following
properties:

a. π(R_i) < S_i < π(W_i).

b. If S(R_i) ∩ S(W_j) ≠ ∅, i ≠ j and π(R_i) < π(W_j) then S_i < S_j.

c. If S(W_i) ∩ S(W_j) ≠ ∅ and π(W_i) < π(W_j) then S_i < S_j.

The real numbers S_1,...,S_n in Definition 4 are called
serializability points. Their intuitive meaning is that the history h is the
same as though transaction T_1 had executed indivisibly at the time
instance S_1 (during which, by (a) above, it was active), transaction
T_2 at S_2, and so on. As an illustration, the history

h - R 1 [x]R 2 [ajW 2 [y]R 3 [z]W 3 [x]W 1 [y] 



is in the class Q, since the values S_1 = 3.5, S_2 = 2.5, and S_3 = 4.5
satisfy, as the reader can check, the requirements of the definition.
The class Q was independently introduced by [Wo].
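
The check left to the reader can be mechanized. The sketch below verifies
conditions (a) through (c) of Definition 4 for a proposed set of points; the
function name and the list-of-triples representation of a history are
assumptions of this illustration. In the example at the end, the read set
of T_2 is taken to be {z} purely as an assumption made for the sketch.

```python
def are_serializability_points(history, points):
    """Check Definition 4 for `points`, a dict {transaction index: S_i}."""
    # pi: 1-based position of each symbol; sets: S(R_i) and S(W_i).
    pi = {(kind, t): p for p, (kind, t, _v) in enumerate(history, start=1)}
    sets = {(kind, t): vs for kind, t, vs in history}
    values = list(points.values())
    # The points must be distinct and non-integer.
    if len(set(values)) != len(values):
        return False
    if any(float(s).is_integer() for s in values):
        return False
    # (a) each point lies strictly between the read and the write.
    for i, s in points.items():
        if not pi[("R", i)] < s < pi[("W", i)]:
            return False
    for i in points:
        for j in points:
            if i == j:
                continue
            # (b) read of T_i precedes a conflicting write of T_j.
            rw = bool(sets[("R", i)] & sets[("W", j)]) and pi[("R", i)] < pi[("W", j)]
            # (c) write of T_i precedes a conflicting write of T_j.
            ww = bool(sets[("W", i)] & sets[("W", j)]) and pi[("W", i)] < pi[("W", j)]
            if (rw or ww) and not points[i] < points[j]:
                return False
    return True


# R_1[x] R_2[z] W_2[y] R_3[z] W_3[x] W_1[y]; the set {z} for R_2 is assumed.
h = [("R", 1, {"x"}), ("R", 2, {"z"}), ("W", 2, {"y"}),
     ("R", 3, {"z"}), ("W", 3, {"x"}), ("W", 1, {"y"})]
print(are_serializability_points(h, {1: 3.5, 2: 2.5, 3: 4.5}))
```

Moving S_1 to 1.5 keeps condition (a) satisfied but violates (c), because
W_2 and W_1 both write y and W_2 comes first.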

THEOREM 3. If h is in Q, then h is DSR. 

Proof . Conditions (b) and (c) of the definition of the class Q 
above are identical to (b) and (c) of Corollary 2 to Theorem 2. Hence 
it suffices to show that condition (a) above implies condition (a) of 
Corollary 2. But this is immediate, because if $\pi(W_i) < \pi(R_j)$ we have
that $s_i < \pi(W_i) < \pi(R_j) < s_j$, no matter what $S(R_j)$ and $S(W_i)$ are. □

Given a history $h = (n,\pi,V,S)$ we can construct another digraph
$D'(h)$, a superdigraph of D(h), with node set again $\{T_1, \ldots, T_n\}$ and
$(T_i, T_j)$ an arc if and only if one of the following holds:

a. $\pi(W_i) < \pi(R_j)$

b. $\pi(R_i) < \pi(W_j)$ and $S(R_i) \cap S(W_j) \neq \emptyset$

c. $\pi(W_i) < \pi(W_j)$ and $S(W_i) \cap S(W_j) \neq \emptyset$.

In other words $D'(h)$ contains all the arcs of D(h) and possibly
other arcs for the cases in which $\pi(W_i) < \pi(R_j)$ and $S(R_j) \cap S(W_i) = \emptyset$.



THEOREM 4. The history $h = (n,\pi,V,S)$ is in the class Q if and
only if $D'(h)$ is acyclic.

Proof. Suppose that $h \in Q$, and let $s_1, \ldots, s_n$ be appropriate
numbers. Without loss of generality $s_1 < s_2 < \cdots < s_n$. We shall show that
whenever $(T_i, T_j)$ is in $D'(h)$, then $i < j$. Suppose that $i > j$; by the
definition of $D'(h)$ one of the following must hold:

a. $\pi(W_i) < \pi(R_j)$. However, $s_i < \pi(W_i) < \pi(R_j) < s_j$, which contradicts
our assumption that $s_1 < s_2 < \cdots < s_n$ and $i > j$.

b. $\pi(W_i) < \pi(W_j)$ and $S(W_i) \cap S(W_j) \neq \emptyset$. By (c) of Definition 4,
however, $s_i < s_j$, again a contradiction.

c. $\pi(R_i) < \pi(W_j)$ and $S(R_i) \cap S(W_j) \neq \emptyset$. Similarly, a contradiction
is reached by (b) of Definition 4.

Consequently, $D'(h)$ is acyclic, since it is a subgraph of a total order.

For the other direction, suppose that $D'(h)$ is acyclic. We can
sort topologically its nodes to obtain the order, say, $(T_1, T_2, \ldots, T_n)$.
We can define the real numbers $s_1, s_2, \ldots, s_n$, and $s_{n+1}$ (for convenience)
as follows:

a. $s_{n+1} = 2n + 1$

b. $s_j = \min\{s_{j+1}, \pi(W_j)\} - \epsilon$, $j = n, n-1, \ldots, 1$, where $\epsilon$ is a
fixed positive amount with $n\epsilon < 1$ (e.g., $\epsilon = 1/(n+1)$).

It is clear that the $s_j$'s are distinct, increasing, non-integer
real numbers, and that they satisfy (b) and (c) of Definition 4. It
suffices thus to prove (a) of Definition 4, in particular that $s_i > \pi(R_i)$
for all i. Suppose that, for some i, $s_i < \pi(R_i)$. Let j be the
smallest index, no smaller than i, for which $\pi(W_j) < s_{j+1}$. Thus

$s_i = \pi(W_j) - (j - i + 1)\epsilon > \pi(W_j) - 1$.

Consequently $\pi(R_i) > \pi(W_j) - 1$, or $\pi(R_i) > \pi(W_j)$. Hence $(T_j, T_i) \in A$,
which contradicts the fact that $j \geq i$ in the topological sorting of
$D'(h)$. □

COROLLARY. Testing whether a history $h = (n,\pi,V,S)$ is in Q can
be done in $O(|V| n^2)$ time. □
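
The constructive half of the proof of Theorem 4 translates directly into a small program. The following Python sketch builds $D'(h)$, sorts it topologically, and derives serializability points by the backward recurrence of the proof. The list-of-triples encoding of a history, the function names, and the concrete decrement `1/(n+1)` are assumptions made for illustration only; arcs are drawn between distinct transactions.

```python
from fractions import Fraction

def positions(history):
    """Map ('R', i) or ('W', i) to its 1-based position pi in the history."""
    return {(k, t): p for p, (k, t, _) in enumerate(history, start=1)}

def dprime_arcs(history):
    """Arcs of D'(h) by rules (a), (b), (c), for i != j."""
    arcs = set()
    for p, (k1, i, s1) in enumerate(history):
        for k2, j, s2 in history[p + 1:]:
            if i == j:
                continue
            if k1 == "W" and k2 == "R":        # rule (a): no intersection needed
                arcs.add((i, j))
            elif k2 == "W" and s1 & s2:        # rules (b) R-W and (c) W-W
                arcs.add((i, j))
    return arcs

def topological_order(nodes, arcs):
    """Return a topological order of the nodes, or None if there is a cycle."""
    indeg = {v: 0 for v in nodes}
    for _, j in arcs:
        indeg[j] += 1
    order, ready = [], sorted(v for v in nodes if indeg[v] == 0)
    while ready:
        v = ready.pop(0)
        order.append(v)
        for a, b in sorted(arcs):
            if a == v:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return order if len(order) == len(nodes) else None

def serializability_points(history):
    """Return {transaction: s_i} if h is in Q, and None otherwise."""
    nodes = {t for _, t, _ in history}
    order = topological_order(nodes, dprime_arcs(history))
    if order is None:
        return None
    pos, n = positions(history), len(nodes)
    eps = Fraction(1, n + 1)            # assumed decrement; needs 0 < n * eps < 1
    points, nxt = {}, Fraction(2 * n + 1)
    for t in reversed(order):           # s_j = min(s_{j+1}, pi(W_j)) - eps
        nxt = min(nxt, Fraction(pos[("W", t)])) - eps
        points[t] = nxt
    return points
```

Exact rational arithmetic is used so that distinctness and the strict inequalities of Definition 4 can be checked without rounding concerns.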



4.3 Two-Phase Locking and the Protocol P3

A very influential proposal for guaranteeing serializability of
update systems has been the two-phase locking mechanism of [EGLT], also
discussed extensively in [BS]. Also, the essence of a quite different
serializability principle (which was used in the development of the SDD-1
distributed system [RG], [BGRP]) is captured by the so-called protocol P3
(see [BS]). In this Subsection we show that these two different
philosophies of serializability are reduced, in our model, to two
efficiently recognizable incommensurate subsets of our class DSR.

The two-phase locking strategy requests and releases actual locks
(i.e., mechanisms that guarantee exclusive data access) during the execution
of the different operations of an update. The rule that is proven
sufficient for guaranteeing serializability is: never request a lock
after a lock has been released. We have, therefore, two phases: one
during which locks may only be requested, followed by one during which
locks can only be released; the first release of a lock delimits the
two phases. In our model of two-step updates the authors of [BS] note
that two-phase locking for a history $h = (n,\pi,V,S)$ essentially amounts
to dividing the interval from $\pi(R_j)$ to $\pi(W_j)$ into two intervals:
one during which no symbol $W_i$ with $S(R_j) \cap S(W_i) \neq \emptyset$ can exist, followed
by one during which no symbol $\sigma \in \Sigma_n$ with $S(\sigma) \cap S(W_j) \neq \emptyset$ can exist.
This is captured by the following definition:

DEFINITION 5. A history $h = (n,\pi,V,S)$ is two-phase locked
(notation: $h \in 2PL$) if and only if there exist distinct non-integer real
numbers $\ell_1, \ldots, \ell_n$ (the lockpoints) such that

a. $\pi(R_i) < \ell_i < \pi(W_i)$ for $i = 1, \ldots, n$

b. If $S(R_i) \cap S(W_j) \neq \emptyset$, $i \neq j$ and $\pi(R_i) < \pi(W_j)$, then $\ell_i < \ell_j$

c. If $S(W_i) \cap S(W_j) \neq \emptyset$ and $\pi(W_i) < \pi(W_j)$, then $\pi(W_i) < \ell_j$. □

To understand Definition 5, consider a transaction $(R_j, W_j)$ in a
history $h \in 2PL$, and its lockpoint $\ell_j$. The intuitive meaning of the
lockpoint is the following: during the interval $[\pi(R_j), \ell_j]$ all
variables in $S(R_j)$ are "protected" from writing by other transactions,
by virtue of (b). Also during the interval $[\ell_j, \pi(W_j)]$ the variables
in $S(W_j)$ are protected from reading and writing. Conditions (b) and
(c) therefore essentially say that the interval $[\ell_j, \pi(W_j)]$ overlaps
no interval $[\ell_k, \pi(W_k)]$ with $S(W_k) \cap S(W_j) \neq \emptyset$ and no interval
$[\pi(R_k), \ell_k]$ with $S(W_j) \cap S(R_k) \neq \emptyset$. Thus, the second lock is granted
before the first is released, in accordance with the two-phase locking
principle.

Although Definitions 4 and 5 differ only slightly in condition (c),
the latter is a substantial restriction. First, we notice that $2PL \subseteq Q$.
Indeed, if $h \in 2PL$ then the lockpoints $\ell_1, \ldots, \ell_n$ are automatically
valid serializability points $s_1, \ldots, s_n$ in Definition 4. To see this,
just notice that condition (c) of Definition 5 ($\pi(W_i) < \ell_j$) together
with (a) ($\ell_i < \pi(W_i)$) imply (c) of Definition 4 (namely, $s_i < s_j$).
To show that the inclusion is proper, notice that for the history

$h = R_1 R_2 R_3[x]\, W_1[x]\, W_2[y,z]\, W_3[y]$

we have that $h \in Q$ (see Figure 5a for $D'(h)$) but $h \notin 2PL$. The
explanation for the latter fact is that transaction 3 has no lockpoint $\ell_3$
since, if it had, $\ell_3$ should obey $\ell_3 < \ell_1 < 4$ (by (b)) and also $\ell_3 > 5$
(by (c)).

We can, however, check very efficiently whether a history h is
two-phase locked. Given any history $h = (n,\pi,V,S)$ we define the history
$h^* = (2n, \pi^*, V, S^*)$, where $h^*$ is obtained from h by inserting a
transaction $R_{n+j} W_{n+j}$ after $W_j$ in h for $j = 1, \ldots, n$; $S^*(R_{n+j}) = \emptyset$,
and $S^*(W_{n+j}) = S(W_j)$. For example, the history $h^*$ for h of the
example above is

$h^* = R_1 R_2 R_3[x]\, W_1[x]\, R_4 W_4[x]\, W_2[y,z]\, R_5 W_5[y,z]\, W_3[y]\, R_6 W_6[y]$.

[Figure 5: (a) the digraph $D'(h)$; (b) the digraph $D'(h^*)$]

THEOREM 5. For a history $h = (n,\pi,V,S)$, $h \in 2PL$ if and only if
$h^* \in Q$.

Proof. Let $\{\ell_1, \ldots, \ell_n\}$ be a set of distinct non-integer real
numbers, and let a(j) be the number of positions to the right that the
symbol $\pi^{-1}(j)$ was shifted in $h^*$; in other words $a(j) = 2 \cdot |\{W_i : \pi(W_i) < j\}|$.
Consider the set $\{s_1, \ldots, s_{2n}\}$, where $s_i = \ell_i + a(\lceil \ell_i \rceil)$ for $i \leq n$, and
$s_i = \pi(W_{i-n}) + a(\pi(W_{i-n})) + 3/2$ for $i > n$. We claim that $\{\ell_i\}$ is an
acceptable set of lockpoints satisfying Definition 5 if and only if
$\{s_i\}$ is a set of serializability points according to Definition 4. Both
directions follow from the definitions. The formal derivation is
omitted. □

To illustrate the theorem, the history h above is in Q, since
$D'(h)$ is acyclic (Figure 5a). However, it is not in 2PL, because $D'(h^*)$
is not acyclic (Figure 5b). Naturally, Theorem 5 yields

COROLLARY. Testing whether a history $h = (n,\pi,V,S)$ is two-phase
locked can be done in $O(n^2 |V|)$ time. □
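
The reduction of Theorem 5 is easy to mechanize: build $h^*$, then test $D'(h^*)$ for acyclicity. The Python sketch below is an illustration only; it assumes a history encoded as a list of `(kind, transaction, variables)` triples with transactions numbered 1 to n, and all function names are hypothetical.

```python
def star(history):
    """Build h*: after each W_j insert a dummy transaction R_{n+j} W_{n+j}
    with an empty read-set and write-set S(W_j)."""
    n = max(t for _, t, _ in history)
    out = []
    for kind, t, s in history:
        out.append((kind, t, set(s)))
        if kind == "W":
            out.append(("R", n + t, set()))
            out.append(("W", n + t, set(s)))
    return out

def dprime_arcs(history):
    """Arcs of D'(h): W before R (any sets), or a later W whose set intersects."""
    arcs = set()
    for p, (k1, i, s1) in enumerate(history):
        for k2, j, s2 in history[p + 1:]:
            if i == j:
                continue
            if (k1, k2) == ("W", "R") or (k2 == "W" and s1 & s2):
                arcs.add((i, j))
    return arcs

def is_acyclic(nodes, arcs):
    """Kahn's algorithm: repeatedly remove nodes that have no incoming arcs."""
    indeg = {v: 0 for v in nodes}
    for _, j in arcs:
        indeg[j] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for a, b in arcs:
            if a == v:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return removed == len(nodes)

def in_q(history):
    """h is in Q if and only if D'(h) is acyclic (Theorem 4)."""
    return is_acyclic({t for _, t, _ in history}, dprime_arcs(history))

def is_two_phase_locked(history):
    """h is in 2PL if and only if h* is in Q (Theorem 5)."""
    return in_q(star(history))
```

On the example history of this subsection the sketch reports membership in Q but not in 2PL, in agreement with the discussion of Figure 5.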

We now turn to formalizing and studying in our model the protocol P3
of [BGRP] and [BS]. Recall the digraph D(h) defined for any history h
in Subsection 4.1; see Figure 6a for an illustration in the case of

$h = R_1[z]\, R_3 W_3[x]\, R_2[x]\, W_1[z]\, R_4 W_2[y,z]\, W_4[x]$

[Figure 6: (a) the digraph D(h); (b) the undirected graph G(h)]

DEFINITION 6. Let G(h) be the undirected graph corresponding to
D(h) (Figure 6b). A cycle in G(h) is a sequence $(T_{i_1}, \ldots, T_{i_m})$ of
$m \geq 2$ transactions such that $[T_{i_j}, T_{i_{j+1}}]$ are edges of G(h),
$j = 1, \ldots, m-1$, and so is $[T_{i_m}, T_{i_1}]$. Notice that all edges are cycles
according to this definition. A cycle $(T_{i_1}, \ldots, T_{i_m})$ is bad if

$[S(R_{i_m}) \cup S(W_{i_m})] \cap S(W_{i_1}) \neq \emptyset$

and

$S(R_{i_1}) \cap S(W_{i_2}) \neq \emptyset$.



Notice that in the above definition the first node of a cycle and
the order of listing of the nodes are important. For example, in
Figure 6 $(T_1, T_2)$ is a bad cycle, whereas $(T_2, T_1)$ is not. Bad cycles
are, intuitively, those cycles that can correspond to a directed cycle in
D(h') for some other history h' involving the same transactions.



DEFINITION 6 (continued). Let $h = (n,\pi,V,S)$ be a history. We say
that $T_j$ is a guardian of $T_i$ if there exists a bad cycle
$(T_i, T_j, \ldots, T_k)$ in G(h). We say that h obeys the protocol P3 (notation:
$h \in P3$) if whenever $T_j$ is a guardian of $T_i$ we do not have
$\pi(R_i) < \pi(W_j) < \pi(W_i)$. □

For example, consider the history h of Figure 6. The only bad
cycle in G(h) (Figure 6b) is $(T_1, T_2)$, and hence the guardian relation
is simple: just $T_2$ is a guardian of $T_1$. Since $\pi(W_2) > \pi(W_1)$, we have
that $h \in P3$.

THEOREM 6. Suppose that $h = (n,\pi,V,S)$ is in P3. Then it is also
in DSR.

Proof. We shall show that $h \in P3$ implies that D(h) is acyclic.
Suppose that D(h) has a cycle $(T_1, T_2, \ldots, T_m)$, $m \geq 2$. Consider the arc
$(T_j, T_{j+1})$ of D(h) (addition mod m); we have three cases:

a. $S(W_j) \cap S(W_{j+1}) \neq \emptyset$ and $\pi(W_j) < \pi(W_{j+1})$.

b. $S(W_j) \cap S(R_{j+1}) \neq \emptyset$ and $\pi(W_j) < \pi(R_{j+1})$.

c. $S(R_j) \cap S(W_{j+1}) \neq \emptyset$ and $\pi(R_j) < \pi(W_{j+1})$.

Notice that in both cases (a) and (b) we have that $\pi(W_j) < \pi(W_{j+1})$, and
that more than one case may be applicable to the same arc. Case (c) is
split into two subcases.

(c1) Cases (a) and (c) do not apply to the arc $(T_{j-1}, T_j)$.

(c2) $j = 1$, or case (a) or case (c) applies to $(T_{j-1}, T_j)$.

In case (c1) we have that $\pi(W_{j-1}) < \pi(R_j) < \pi(W_{j+1})$. In case (c2), however,
we notice that $T_{j+1}$ is a guardian of $T_j$. Consequently, since $\pi(R_j) <
\pi(W_{j+1})$ we must necessarily have that $\pi(W_j) < \pi(W_{j+1})$.

Now, consider the operations $O_j$, $j = 1, \ldots, m$, where $O_j = R_j$ if
case (c1) is applicable to the arc $(T_j, T_{j+1})$, and $O_j = W_j$ otherwise.
We have shown that $\pi(O_j) < \pi(O_{j+1})$ for $j = 1, \ldots, m$ (addition mod m).
This is a contradiction, since it implies that $\pi(W_1) < \pi(W_1)$. □

Theorem 6 implies the following, independently proved in [BS].

COROLLARY. Histories that obey the protocol P3 are serializable. □

Our next result concerns the complexity of recognizing those histories
that obey protocol P3. By the definition of this class, this complexity
is determined by the complexity of computing the guardian relation among
the transactions in a history. We shall show how this relation can be
computed efficiently. For each transaction $T_i$, let $\Gamma(T_i)$ be the set
of all transactions $T_j$ that satisfy $S(R_i) \cap S(W_j) \neq \emptyset$. Thus, $\Gamma(T_i)$
is the set of all transactions that are possibly guardians of $T_i$. To
determine whether a transaction $T_j \in \Gamma(T_i)$ is indeed a guardian of $T_i$,
we delete all edges $[T_i, T_k]$ such that $S(W_i) \cap [S(W_k) \cup S(R_k)] = \emptyset$ from
G(h), and then determine whether $T_i$ and $T_j$ are on the same biconnected
component of the resulting graph. This can be done in $O(n^2)$ time by
the algorithm of [Ta]. If $T_i$ and $T_j$ are on the same biconnected
component, this means that there is a bad cycle $(T_i, T_j, \ldots, T_k)$ in G(h),
and hence $T_j$ is a guardian of $T_i$; otherwise, it is not. Repeating this
for all $T_i$'s, we get an algorithm of total complexity $O(n^2(|V| + n^2))$.
Hence we have

THEOREM 7. Testing whether a history $h = (n,\pi,V,S) \in P3$ can be
done in $O(n^2(|V| + n^2))$ time. □
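
For experimentation, the guardian relation can also be computed by a direct reading of Definition 6 instead of the biconnected-component algorithm described above. The Python sketch below swaps in a plain reachability search: $T_j$ is taken to be a guardian of $T_i$ when $S(R_i) \cap S(W_j) \neq \emptyset$ and some node $T_k$ with $[S(R_k) \cup S(W_k)] \cap S(W_i) \neq \emptyset$ is either $T_j$ itself or reachable from $T_j$ in G(h) without passing through $T_i$. The list-of-triples encoding and all function names are assumptions, and no claim is made that this matches the running time of Theorem 7.

```python
def sets_of(history):
    """Read- and write-sets per transaction: {i: (S(R_i), S(W_i))}."""
    rs, ws = {}, {}
    for kind, t, s in history:
        (rs if kind == "R" else ws)[t] = set(s)
    return {t: (rs[t], ws[t]) for t in rs}

def conflict(a, b):
    """Two transactions are adjacent in G(h) when a write of one meets a
    read or write of the other (the direction in D(h) depends on the order)."""
    (ra, wa), (rb, wb) = a, b
    return bool(wa & (rb | wb) or wb & ra)

def guardians(history):
    """Set of pairs (j, i) such that T_j is a guardian of T_i."""
    S = sets_of(history)
    txns = list(S)
    result = set()
    for i in txns:
        r_i, w_i = S[i]
        # Nodes T_k that may close a bad cycle starting at T_i.
        closers = {k for k in txns if k != i and (S[k][0] | S[k][1]) & w_i}
        for j in txns:
            if j == i or not (r_i & S[j][1]):
                continue
            # Nodes reachable from T_j in G(h) without passing through T_i.
            seen, stack = {j}, [j]
            while stack:
                u = stack.pop()
                for v in txns:
                    if v != i and v not in seen and conflict(S[u], S[v]):
                        seen.add(v)
                        stack.append(v)
            if seen & closers:
                result.add((j, i))
    return result

def obeys_p3(history):
    """h obeys P3 if no guardian T_j writes between R_i and W_i."""
    pos = {(k, t): p for p, (k, t, _) in enumerate(history, start=1)}
    return not any(pos[("R", i)] < pos[("W", j)] < pos[("W", i)]
                   for j, i in guardians(history))
```

Because the guardian relation depends only on the read- and write-sets, `guardians` could be computed once and reused for every ordering of the same transactions.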



4.4 The Class SSR 

Certain histories, though perfectly serializable, have a curious (and,
according to some, undesirable) property. Consider, for example, the
history

h - ILjxlR^MRgWjtozlW^y] . 

This history is serializable. However, the only serial history equi- 
valent to h is easily shown to be 

h S " R 3 W 3 Iy ' x,R l I ' xlW l! y3 *2 ,, 2 W . 

What is interesting is that in h transaction 2 has completed
execution before transaction 3 has started executing, whereas the order
in $h_s$ has to be the reverse. This phenomenon is quite counterintuitive,
and it has been opined that perhaps the notion of correctness in trans- 
action systems has to be strengthened so as to exclude, besides histories 
that are not serializable, also histories that present this kind of 
behavior. This leads to the following definition: 

DEFINITION 7. A history $h = (n,\pi,V,S)$ is said to be serializable
in the strict sense (notation: $h \in SSR$) if there is a serial history
$h_s = (n,\pi',V,S)$ such that $h \equiv h_s$, and $\pi(W_i) < \pi(R_j)$ implies
$\pi'(W_i) < \pi'(R_j)$. □

It is not hard to verify that all histories in the class Q satisfy
Definition 7. To see this, recall that a history h in Q has a set of
serializability points $s_1 < s_2 < \cdots < s_n$, say, such that
$h_s = R_1 W_1 \cdots R_n W_n \equiv h$. Now, if $\pi(W_i) < \pi(R_j)$, we have, by the
definition of $s_i$, $s_i < \pi(W_i) < \pi(R_j) < s_j$, and therefore $i < j$. Hence
transactions i and j have the same order in $h_s$ that they have in h.
It follows that $Q \subseteq SSR$.

Nevertheless, the classes Q and SSR are not the same, contrary to
what was conjectured by [Wo]. A counterexample is

$h = R_1[z]\, R_2[z]\, W_2[x,z]\, R_3[x]\, W_1[x,y]\, W_3[z]\, R_4[y]\, W_4[x]$.

This history is equivalent to the serial history

$h_s = R_1[z]\, W_1[x,y]\, R_2[z]\, W_2[x,z]\, R_3[x]\, W_3[z]\, R_4[y]\, W_4[x]$,

satisfying Definition 7. However, h is not in Q; to check this, just
notice that the digraph $D'(h)$ shown in Figure 7 is not acyclic. It is
not known whether the class SSR is efficiently recognizable.




Figure 7 
4.5 Summary

The topography of the set of all histories H and its subclasses
SR, S (the serial histories), Q, SSR, DSR, P3 and 2PL is depicted in
Figure 8. The inclusions shown either follow from the results of this
section, or are straightforward. We also show below an example of a
history for each of the 12 regions in this diagram.
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Figure 8 



6 



h 8 " 



10 



11 



R 1 [x]W 1 [x]R 2 [x]W 2 tx3 
^CxlRjCyll^txlWjty] 
\^ 3 [xjllj lx)W 2 [y,xlW 3 {y] 
Rj^ [xlR 2 W 2 [x,?]!^ Iz]R 3 W 3 [y ,z] 



h 3 . h 4 



R2 [z]R 1 » 2 [x,z]R 3 [x]W 3 tz]W x [x # y]R 4 [y]» 4 [x] 

R 3 [x] R^ [x] R2 [y ] W 2 W 3 [y ] 

R 2 [zJRj^ tz]W 2 [x, z]R 3 [xJMj^ [x,y]W 3 [z]R 4 [y]W 4 tx] 



R 1 R 3 W 3 CxlR 2 [xlW l CxlW 2 [xl 



h ? o h 4 



h 7° h 9 



h 12 " R 1 [ x l R 2 W w 1 tx]W 2 [x] 
5. RESTRICTIONS ON THE READ- AND WRITE-SETS

It turns out that if we impose certain restrictions on the structure
of the map S of a history (i.e., the read- and write-sets of the
transactions in the history) the topography of H (shown in Figure 8 for the
general case) is simplified considerably. The most striking such result
is that of [SLR]. A basic assumption in the model of [SLR], which is
otherwise more general than the present in that it allows more than two
steps, is that no database entity (or variable) is updated unless it has
been previously read. In our model and notation, this means that
$S(W_j) \subseteq S(R_j)$. What is surprising is that serializability, an NP-complete
predicate in our model, is efficiently decidable in theirs. We explain
this in view of our previous discussion as follows:

THEOREM 7. Suppose that for a history $h = (n,\pi,V,S)$ we have
$S(W_j) \subseteq S(R_j)$ for $j = 1, \ldots, n$. Then h is serializable if and only if
h is in DSR.

Proof. It suffices to show that if $S(\sigma_1) \cap S(\sigma_2) \neq \emptyset$ and
$\pi(\sigma_1) < \pi(\sigma_2)$ for $\sigma_1, \sigma_2 \in \Sigma_n$ such that at least one of $\sigma_1$, $\sigma_2$ is a
write symbol, then $\pi'(\sigma_1) < \pi'(\sigma_2)$ in any history $(n,\pi',V,S)$
equivalent to h. Suppose that $\sigma_1 = W_i$, $\sigma_2 = W_j$. $S(W_i)$ and $S(W_j)$ share
a variable x, which, by hypothesis, is also in $S(R_i)$ and $S(R_j)$.
Consequently, in h $T_j$ reads x from either $T_i$ or from another
transaction which, by the same argument, reads x from another, and so
on, up to $T_i$. Now, notice that the $S(R_j) \supseteq S(W_j)$ assumption implies
that in any serializable history there can be no dead transactions. Hence,
by Proposition 1, in any history $(n,\pi',V,S)$ equivalent to h we must
also have $\pi'(W_i) < \pi'(W_j)$. The other two cases are settled very
similarly. □



It turns out that the rest of the classes of histories discussed
previously have a considerably simpler structure under the assumption
that $S(W_j) \subseteq S(R_j)$. We show below, without proofs, the corresponding
diagram.

Figure 9

Under a different restriction on S, the class SSR coincides with SR.

THEOREM 8. Suppose that in a history $h = (n,\pi,V,S)$ there is a
subset $X = \{x_1, x_2, \ldots, x_n\} \subseteq V$ such that for $j = 1, 2, \ldots, n$ we have
(a) $X \subseteq S(R_j)$, (b) $x_j \in S(W_i)$ if and only if $i = j$. Then h is
serializable if and only if $h \in SSR$.

Sketch of Proof. Imagine that the variable $x_j$ is a Boolean
signalling whether transaction $T_j$ has completed. Therefore, if $T_i$ completed
in h before $T_j$ started, the same must hold in any other history equivalent
to h. □
6. SCHEDULERS OF HISTORIES

The practical importance of the classes of histories 2PL and P3 
discussed in Section 4 stems from the fact that they are known to 
correspond to simple schedulers. A scheduler for a class of histories 
(to be defined formally below) is generally an algorithm that takes as 
an input an arbitrary history (possibly non-serializable) and returns a
history which is the "closest" to the given one among those belonging to 
the class. If the class is a subset of SR, therefore, the scheduler 
guarantees that its output history is serializable . Such a scheduler 
can be used in the serializability component of the database management 
system. Of course, in practice one would expect that a scheduler operates 
on-line and is reasonably efficient. 

The history-input of the scheduler is the sequence of arriving 
user requests. The output of the scheduler is the actual execution 
sequence. The basic fact that makes our approach very different from 
previous work on concurrency control which was motivated by operating 
systems (e.g., the notion of determinacy of [CD]) is that the supplier 
of this input history is a population of users, each user being unaware 
of the actions of the others. This implies that the order of arrival 
of these requests has no semantic content whatsoever, and therefore
the scheduler is not bound to produce an output which is equivalent 
(or related in any prescribed way) to the input. In fact, the operation 
of the scheduler becomes interesting and important exactly when the 
scheduler must necessarily transform the input to an inequivalent output, 
because the input is non-serializable, say. 

There are, however, certain performance criteria that the input-
output mapping of a scheduler should satisfy. For example, a trivial 
scheduler which guarantees serializability is the one that outputs 
only serial histories. This is, however, too restrictive a mechanism 
to be of practical value. Intuitively, the richer the output class, 
the more powerful the scheduler, because a less restrictive class 
of histories will require less reshuffling of the operations and will 
cause fewer and shorter unnecessary delays. Ideally, we would like to 
have a serializer whose output spans all of SR. Unfortunately, we
shall soon see that the existence of such a practically useful device 
is very improbable. 

DEFINITION 8. The metric d(.,.) on the set H is defined as 
follows : 

a. $d((n,\pi,V,S), (n,\rho,V,S)) = n - \max\{j : \pi^{-1}(i) = \rho^{-1}(i),\ i = 1, \ldots, j\}$.

b. $d((m,\pi,V,S), (n,\rho,W,T)) = \infty$ if any one of $m \neq n$, $V \neq W$,
$S \neq T$ holds. □

The distance between two histories defined on the same set of 
transactions is therefore n minus the length of their longest common 
prefix. Notice that d(.,.) satisfies the metric axioms. A variety of 
other metrics would suffice for what follows. 

DEFINITION 8 (continued). Let C be a non-empty subset of H.
A scheduler for C is a function $A_C : H \to C$ such that

$d(h, A_C(h)) = \min\{d(h,h') : h' \in C\}$. □

Thus, $A_C$ can be thought of as projecting H onto C under the
metric $d(\cdot,\cdot)$. Notice that $A_C(h)$ and h will not be equivalent in
general. The metric $d(\cdot,\cdot)$ requires that $A_C$ leaves histories in C
intact, and, in fact, it leaves intact as long prefixes of arbitrary
histories as possible.

Let us restate now the assumptions of our model of schedulers.

(a). A scheduler $A_C$ minimizes the d-distance between its input
and its output. This intuitively means that the scheduler operates
on-line, and, furthermore, that it acts in an optimistic way: As long as
the history seen so far could possibly be extended to a correct history
(here by "correct history" we mean one which the scheduler, in its
limited sophistication, recognizes as correct, or, equivalently, an
element of $C = A_C(H)$) the scheduler does not intervene to rearrange read
and write requests. As a corollary, if the scheduler is fed with its
own output, it leaves it intact; it is therefore idempotent, or a projection.

This is a quite reasonable assumption to make. Although we cannot
totally exclude the possibility of schedulers that operate otherwise
(for example, anticipating future requests that will make the history
non-serializable), all schedulers proposed in the past satisfy this
assumption. Any scheduler implemented by natural constructs such as locks
[KP], [EGLT] or queues has this property.

(b). Among all histories in C that have the longest possible common
prefix with the input history, $A_C$ selects any one as its output. Clearly,
in practice this choice would be made so as to minimize some more refined
metric $d'$. However, the results obtained below for our weaker metric
$d$ would apply to more relaxed metrics, too.

We say that $A_C$ is an efficient scheduler if $A_C$ is computable in
polynomial time. Our goal in this Section is to understand which classes
of histories have efficient schedulers. It is tempting to conjecture
that if a class is in P, then it has an efficient scheduler. This
conjecture is not plausible, as the following example shows:

EXAMPLE. Let $E = \{h \circ h_s : h_s \text{ is serial, and } h \equiv h_s\}$.

Obviously, E can be recognized in polynomial time; the algorithm
involves splitting a given history in two halves, testing whether the
second half is serial, and whether the second half is equivalent to the
first. However, it is also easy to see that E cannot have an efficient
scheduler, unless P = NP. Suppose that E has an efficient scheduler
$A_E$. Then we could test whether an arbitrary history h is serializable
by first computing $A_E(h \circ h)$, and then checking whether $A_E(h \circ h)$ starts
with h. Since $A_E$ is supposed to leave unchanged as long prefixes of
its input as possible, it will alter the first half of $h \circ h$ only if h
is not serializable. Since serializability is known to be NP-complete, E
cannot have an efficient scheduler unless P = NP. □

Our next result essentially says that efficiently recognizable
classes have efficient schedulers, unless they are as pathological as
our example E above. Let $h = (n,\pi,V,S)$ be a history, considered now
as a string of symbols representing n, V, S and the permutation $\pi$.
A prefix of h is an initial segment of this representation, containing
the encoding of n, V, S, as well as an initial part of $\pi$, i.e.,
$\langle \pi^{-1}(1), \pi^{-1}(2), \ldots, \pi^{-1}(j) \rangle$ for some $0 \leq j \leq 2n$. If C is a class of
histories, then PR(C) is the set of all prefixes of all histories in C.

THEOREM 9. Let C be a subset of H. C has an efficient scheduler
if and only if $PR(C) \in P$.

Proof. Suppose that C has an efficient scheduler $A_C$. In order
to determine whether a string g is a prefix of a history $h \in C$ we may
act as follows: We first verify that g contains encodings of n, V,
and S, together with an initial segment $\rho$ of a permutation $\pi$ of $\Sigma_n$.
We then generate a completion $\bar{\rho}$ of $\rho$ by juxtaposing to $\rho$ the
symbols $W_j$ such that $R_j$ but not $W_j$ is present in $\rho$, and then the
strings $R_j W_j$ for all j's such that neither $R_j$ nor $W_j$ appears in
$\rho$. We then calculate $h' = A_C((n,\bar{\rho},V,S))$. It is straightforward to see
that g is a prefix of h' if and only if $g \in PR(C)$. Thus we can
efficiently determine whether $g \in PR(C)$.

For the other direction, suppose that $PR(C) \in P$. Based on the
recognition algorithm for PR(C) we design an efficient scheduler $A_C$,
shown in Figure 10. $A_C$ computes $A_C(h) = (n,\rho,V,S)$ by determining $\rho$
element-by-element. It should be obvious that $A_C$ operates as
prescribed within a time bound of $O(n^2 C(n,|V|))$, where $C(n,|V|)$ is
the complexity of recognizing PR(C). The theorem follows. □

It is now easy to link the discussion of Sections 3 and 4 with the
existence of efficient schedulers. We get two types of results:

COROLLARY 1. Unless P = NP, SR has no efficient scheduler. □

COROLLARY 2. The classes S, 2PL, P3, Q, DSR have efficient
schedulers.

Proof. We have shown that these sets are in P; it is usually
straightforward to show that their sets of prefixes are also in P (this
is not a general property of P; there are languages in P that have
non-recursive sets of prefixes). As an illustration, we will sketch a
proof that $PR(P3) \in P$. First, given an encoding of n, V, S, and a
segment $\rho$ of $\pi$, we compute from S the digraph F of the guardian
relation among $\{T_1, \ldots, T_n\}$. We next make sure that whenever $T_j$ is a
guardian of $T_i$ and $\rho(W_j)$ is defined, then either $\rho(W_i) < \rho(W_j)$, or
$\rho(R_i) > \rho(W_j)$, or $\rho(R_i)$ is undefined. Finally, we make sure that $\rho$

Scheduler $A_C$

Input:  a history $h = (n,\pi,V,S)$

Output: a history $h' = (n,\rho,V,S) \in C$ such that d(h,h') is the
        smallest possible, if such an h' exists.

begin
   if $(n, \langle\ \rangle, V, S) \notin PR(C)$ then return
      comment $\langle\ \rangle$ is the empty permutation;
   else begin
      $\rho := \langle\ \rangle$;
      for j = 1, ..., 2n do
      begin
         done := false;
         for i = j, j+1, ..., 2n do until done
            if $(n, \langle \rho, \pi^{-1}(i) \rangle, V, S) \in PR(C)$ then
            begin
               done := true;
               interchange $\pi^{-1}(i)$ and $\pi^{-1}(j)$;
               comment after the interchange the chosen symbol is at position j;
               $\rho := \langle \rho, \pi^{-1}(j) \rangle$;
            end;
      end;
   end;
   return $(n,\rho,V,S)$;
end

Figure 10
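
The greedy loop of Figure 10 can be sketched as a small Python function that takes the prefix-recognition procedure as a parameter. Everything named below is an assumption made for illustration: steps are encoded as `(kind, transaction)` pairs, `schedule` and `is_serial_prefix` are hypothetical names, and the prefix test shown is for the class S of serial histories only, chosen because its prefixes are trivial to recognize.

```python
def schedule(history, in_prefix_class):
    """Greedy scheduler in the spirit of Figure 10.

    history         -- list of steps in arrival order
    in_prefix_class -- predicate deciding whether a list of steps is in PR(C)
    Returns the output order, or None if no history of C can be produced.
    """
    pi = list(history)
    rho = []
    if not in_prefix_class(rho):
        return None
    for j in range(len(pi)):
        for i in range(j, len(pi)):
            # Take the earliest pending step that keeps rho inside PR(C).
            if in_prefix_class(rho + [pi[i]]):
                pi[i], pi[j] = pi[j], pi[i]   # the "interchange" of Figure 10
                rho.append(pi[j])
                break
        else:
            return None                        # no pending step can extend rho
    return rho

def is_serial_prefix(prefix):
    """Prefix test for serial histories: steps come in adjacent R_i W_i pairs."""
    open_txn, done = None, set()
    for kind, t in prefix:
        if kind == "R":
            if open_txn is not None or t in done:
                return False
            open_txn = t
        else:
            if open_txn != t:
                return False
            done.add(t)
            open_txn = None
    return True
```

For instance, `schedule([("R", 1), ("R", 2), ("W", 1), ("W", 2)], is_serial_prefix)` delays the second read until the first transaction has written. The loop makes at most $O(n^2)$ calls to the prefix test, in line with the time bound stated in the proof of Theorem 9.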

can be completed in a manner not violating P3. It turns out that this
amounts to verifying that the restriction of F to the transactions
that are active (i.e., $\rho(R_j)$ is defined but $\rho(W_j)$ is not) is acyclic
(a discussion of this part follows the proof). Hence we have an
efficient algorithm for PR(P3). □

We show in Figure 11, without proofs, stylized versions of efficient
schedulers for the classes 2PL (11b), P3 (11a), DSR and Q (11c; for Q
we also include the two statements labeled Q). Besides serializability,
these algorithms must also guarantee the absence of deadlocks. The
issue of deadlocks appears to be orthogonal to that of serializability,
and, in fact, clever serializability methods are known to introduce
increased danger of deadlocks of the "circular waiting" variety ([CD],
pp.40-(0). A unified treatment of serializability and deadlocks in a
restricted data model is attempted in [SK]. In all cases of interest to
us, deadlocks can be prevented by testing a dynamically changing deadlock
graph for acyclicity. For example, in two-phase locking deadlock can
occur if a number of transactions have each locked their read-set, and
are waiting for each other to release their locks. Hence, in this case
the deadlock graph has variables as nodes, and has an arc from x to y
if and only if some transaction currently on phase 1 reads x and writes
y. In P3 the deadlock graph is the restriction of the guardian relation
to the currently active transactions; this was mentioned in the proof of
Corollary 2 to Theorem 9. Finally the deadlock graph in DSR (resp., Q)
has as nodes the active transactions and includes the arc $(T_i, T_j)$ if
and only if there is a path from $T_i$ to $T_j$ in D(h) (resp. $D'(h)$)
and $S(W_i) \cap S(W_j) \neq \emptyset$.

Our notation in Figure 11 assumes that the process $R_j$ or $W_j$ is
initiated as soon as a corresponding read or write request arrives.
We use constructs such as when (denoting the awaiting for a condition)
and ibegin...iend (bracketing statements that are to be executed
indivisibly). It should be obvious that these algorithms can be
implemented deterministically and efficiently on any standard model of
computation.

process $R_j$
   when the deadlock graph with $T_j$ is acyclic do
      output($R_j$)

process $W_j$
   when $T_j$ is not the guardian of an active transaction do
      output($W_j$)

(a)

process $R_j$
   when the deadlock graph with $T_j$ is acyclic and
        no variable in $S(R_j)$ is read-locked do
   ibegin
      write-lock all variables in $S(R_j)$;
      output($R_j$)
   iend;
   when a process $W_i$ with $S(W_i) \cap S(R_j) \neq \emptyset$ or $i = j$ has been initiated and
        no variable in $S(W_j) - S(R_j)$ is write-locked do
   ibegin
      write-lock and read-lock all variables in $S(W_j)$;
      un-write-lock all variables in $S(R_j) - S(W_j)$.
   iend

process $W_j$
   when $R_j$ has terminated do
   ibegin
      output($W_j$)
      unlock all variables in $S(W_j)$
   iend

(b)
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PV0068B R. 

declare L. sequence of symbols in Z U {f } 

aonment L . contains all H ± or V ± such that T^, is reachable by a 
path from T in D (resp. D'), up to this point; 

when the deadlock graph is acyclic and for no T^, f T k 

with s(R.)ns(w i ) * 0, s(R j )ns(w k ) + is \^.\ do 
ibegi-n 

output (R.) 

add R to all L containing V ± with SCR^) S(W 1 > + 
Q: add R. to all L. containing f 

■tend 



■pvoaess w. 

when the deadlock graph contains no arc (T^T^) do 
ibegin 
output (w.) 

add W. to all L. containing CT such that S(W.j) n S(a) 7< 
Q: add f to all L. containing R. or W. 
set L. : = 
iend 

(c) 

Figure 11 
7. DISCUSSION

We shall consider extensions of our results in three directions: 
general multi-step transactions, interpreted transactions, and 
distributed databases. 



7.1 Multi-step Transactions 

We shall briefly discuss how our entire development of Sections 2
through 6 can be easily extended to a far more general multi-step model of
transactions. We consider transactions that consist of sequences of
steps; each step may involve both reading and writing. The values written must
be considered as uninterpreted functions of all variables read at the
present or previous steps of the same transaction. Our definition of
liveness now applies to individual steps of transactions. No further
modifications are necessary for stating the analog of Proposition 1.

Serializability is obviously NP-complete in this model, as it
subsumes ours. Assuming that no transaction reads intermediate results
of another or reads two different versions of the same variable at two
different steps (in which case the history is not serializable), Lemma 2
is also valid. The four serializability principles discussed in Section 4
remain virtually unchanged; in fact, two-phase locking was initially
proposed for a similar model in [EGLT]. For another example, we shall describe
in a somewhat more detailed manner the generalized P3 class of histories.
In the multi-step model a step s of a transaction can be an (i,j)-guardian
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of another transaction , where i < j are steps . This means that 8 
interacts with i— i.e., either its write set includes variables of i, 
or vice-versa — and there is a chain of interactions from s to j. If 
this is the case, s is not allowed to occur between i and 3. This 
P3 protocol always yields DSR (and hence serializable) histories. 
For the classes DSR and Q, we have similar graphs D(h) and D'(h). An 
arc (T.,T.) is in D(h) if a step of T. interacts with a subsequent 
step of T.. For D'(h), it may just be that the last step of T ± 
precedes the first step of T.. The acyclicity of D(h) again guarantees 
serializability, and that of D'(h) strict serial izability. Hence, these 
remain two most general serializability techniques, subsuming two-phase 
locking and P3, in this general setting, too. 
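
The multi-step graph D(h) can be sketched in a few lines. The following
is only an illustration under stated assumptions, not the scheduler of
Figure 11: a step is encoded as a hypothetical triple (transaction id,
read set, write set), and "interacts" is read as "the write set of one
step shares a variable with the read or write set of the other". The
names interacts, d_graph and is_acyclic are ours.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# Assumed encoding (not from the paper): a step is
# (transaction id, read set, write set).

def interacts(a, b):
    """True if the write set of one step touches a variable of the other."""
    _, ra, wa = a
    _, rb, wb = b
    return bool(wa & (rb | wb)) or bool(wb & (ra | wa))

def d_graph(history):
    """Arcs (Ti, Tj), i != j, where a step of Ti interacts with a later step of Tj."""
    arcs = set()
    for pos, a in enumerate(history):
        for b in history[pos + 1:]:
            if a[0] != b[0] and interacts(a, b):
                arcs.add((a[0], b[0]))
    return arcs

def is_acyclic(arcs):
    """Check acyclicity by attempting a topological sort of the arcs."""
    sorter = TopologicalSorter()
    for u, v in arcs:
        sorter.add(v, u)  # u must precede v
    try:
        sorter.prepare()
    except CycleError:
        return False
    return True

# T1 reads x, T2 reads and writes x, then T1 writes x: arcs in both directions.
looped = [(1, {"x"}, set()), (2, {"x"}, {"x"}), (1, set(), {"x"})]
# T1 updates x, then T2 reads x and works on y and z: a single arc (T1, T2).
chained = [(1, {"x"}, {"x"}), (2, {"x"}, {"y"}), (2, {"y"}, {"z"})]
```

Here is_acyclic(d_graph(chained)) holds while is_acyclic(d_graph(looped))
does not, so only the second history would be rejected by a test based on
the acyclicity of D(h).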

Finally, it is easy to see that the results of Section 6 (the
necessary and sufficient condition for the existence of efficient
schedulers, and its corollaries) apply even more directly to multi-step
histories. We hope that the reader is by now convinced that introducing
general multi-step transactions would have resulted in an unmanageably
cumbersome notation but in very few new important ideas.



7.2 Interpreted Transactions 

A significant departure from our model would be to look more closely
into the computations performed by the transactions and exploit their
details for studying serializability, or correctness in general. If
only syntactic information about the transactions is available (e.g., the
read- and write-sets) then serializability can be formally proved to be
the right concurrency concept [KP]. If, however, the semantics of the
functions performed, or even the integrity constraints, are known, then
it may be the case that more liberal concurrency principles than
serializability are applicable. An example is the correctness theory proposed
in [La1], where the concurrency control mechanism takes into account
information about the semantics and integrity constraints supplied by
correctness proofs of the individual transactions. The extent to which such
information is helpful is investigated in [KP].

It is doubtful whether complete semantic information can be used
effectively for concurrency control. Any reasonably complex domain of
interpretation (e.g., arithmetic) would soon make the serializability
problem undecidable. There should be, however, ways to use partial
semantic information in order to improve our understanding of
serializability. One possibility is to use the fact that two transactions
perform precisely the same function; one of the implications is that they
commute. It is not too hard to see that this adds nothing to the model
developed thus far. Incidentally, this allows us to extend our original
model so as to permit multiple occurrences of a transaction in a history.

Another possibility would be to selectively consider certain very
simple transactions to be interpreted. A good example of a very common
transaction that performs a well-understood function is the copier,
a transaction that reads x and later records its value at y.
Serializability becomes trickier. For example, the history

h = R_1[x] R_2 R_3[x] W_2[x] W_3[y] R_4[y] W_4[x] R_5[x] W_5[z] W_1[z]

is not serializable in our ordinary sense, but becomes equivalent to a
serial history of T_1, ..., T_5 once we assume that transactions 3 and 4
are copiers. Proposition 1 becomes somewhat more complex in the
presence of copiers. However, it is interesting to note that if copiers
are restricted not to read variables from other copiers, then the
introduction of copiers adds no strength to our model, and Proposition 1
and Lemma 2 remain unchanged under this assumption. This remark plays
an important role in the next topic of our discussion.
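
The effect of copiers on the history h can be checked mechanically by
symbolic execution. The sketch below rests on several assumptions of
ours: R_2 is taken to have an empty read set, written values are
uninterpreted terms, a copier writes exactly the value it read, and two
histories count as equivalent when every transaction reads the same
symbolic values and the final states agree. This last criterion is a
simplification that ignores liveness, so it is stricter than the
definition used in the earlier sections. The names H, run, serial and
equivalent_serial_orders are hypothetical.

```python
from itertools import permutations

# Assumed encoding: a step is (kind, transaction id, variables).
H = [("R", 1, ["x"]), ("R", 2, []), ("R", 3, ["x"]), ("W", 2, ["x"]),
     ("W", 3, ["y"]), ("R", 4, ["y"]), ("W", 4, ["x"]), ("R", 5, ["x"]),
     ("W", 5, ["z"]), ("W", 1, ["z"])]

def run(history, copiers=frozenset()):
    """Symbolically execute a history; return (values read per txn, final state)."""
    state, views = {}, {}
    for kind, txn, variables in history:
        if kind == "R":
            views[txn] = tuple(state.get(v, ("init", v)) for v in variables)
        else:
            for v in variables:
                if txn in copiers:
                    state[v] = views[txn][0]            # copy the value read
                else:
                    state[v] = ("f", txn, v, views[txn])  # uninterpreted term
    return views, state

def serial(history, order):
    """Serial history running each transaction's read, then its write."""
    steps = {(kind, txn): variables for kind, txn, variables in history}
    return [(k, t, steps[(k, t)]) for t in order for k in ("R", "W")]

def equivalent_serial_orders(history, copiers=frozenset()):
    """Brute-force search over all serial orders (fine for five transactions)."""
    target = run(history, copiers)
    txns = sorted({txn for _, txn, _ in history})
    return [order for order in permutations(txns)
            if run(serial(history, order), copiers) == target]

ordinary = equivalent_serial_orders(H)
with_copiers = equivalent_serial_orders(H, frozenset({3, 4}))
```

Under these assumptions ordinary is empty, while with_copiers contains,
among others, the order T_5 T_1 T_3 T_2 T_4: once T_3 and T_4 merely
copy, the value T_5 reads from T_4 is the initial value of x, so T_5 may
be moved to the front.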

7.3 Distributed Databases

There is a large body of literature aiming at the understanding of
the quite elusive notion of distributed computing (see, for example, [La2]).
Distributed databases have inherited some of the intricacies of this
area [RG], [Th]. We shall limit our discussion to the case of two
complete copies of the database in different locations, although there
are difficulties which first appear in the cases of three copies or of
selective redundancy [BSRG]. A major problem is what happens when a
transaction is run in one location, thus changing only one of the two
copies. A simple technique for solving this would be to send an update
message [BGRP] to the other location as soon as the transaction has
completed. We have therefore a sequence of genuine transactions and
update messages running in the system, and we can thus view the two
copies of the database as a single database: think of the two copies of
the variable x as two variables x_1 and x_2.

A difficulty appears when we try to define a history. The distributed
nature of our computation, the communication delays and imperfect clocks
make temporal priority, on which our ordinary notion of history was
based, less tangible. The observation here is that mistakes in our
arrangement of the events which are due to the above factors preserve
history equivalence. Hence, we can put together a history (the global
log of [BGRP]) as long as it is consistent with local priorities and
arrivals of messages. Now, the update messages are in fact just copiers,
and they only read variables that were updated by ordinary transactions.
Hence the last remark of the previous subsection is applicable, and the
serializability problem has been reduced to the one already studied! Of
course, we are not just looking for serializability, but for the
existence of an equivalent serial history in which an update message
immediately follows the corresponding transaction. This, however, does
not change the essence of the task. All our special case results hold
with very minor modifications.
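
As an illustration of this reduction, the target serial history can be
spelled out for the two-copy case. The sketch assumes a hypothetical
encoding in which a transaction is (id, site, read set, write set) with
site 1 or 2, the copy of x at site s is named "x_s", and an update
message is modeled as one copier per written variable; the paper does
not fix these details.

```python
def local_name(variable, site):
    """Name of the copy of a variable held at a given site, e.g. x_1."""
    return f"{variable}_{site}"

def expand(txn_id, site, reads, writes):
    """A transaction on its local copy, immediately followed by its update message."""
    other = 2 if site == 1 else 1
    steps = [("R", txn_id, [local_name(v, site) for v in reads]),
             ("W", txn_id, [local_name(v, site) for v in writes])]
    for v in writes:
        # Copier: read the freshly written local copy, write the remote copy.
        message = f"u{txn_id}:{v}"
        steps.append(("R", message, [local_name(v, site)]))
        steps.append(("W", message, [local_name(v, other)]))
    return steps

def target_serial_history(transactions):
    """Serial history in which every update message follows its transaction."""
    return [step for txn in transactions for step in expand(*txn)]

# T1 runs at site 1 and updates x; T2 runs at site 2, reads x and writes y.
example = target_serial_history([(1, 1, ["x"], ["x"]), (2, 2, ["x"], ["y"])])
```

Each copier here reads only a variable just written by an ordinary
transaction, which is exactly the restriction under which Proposition 1
and Lemma 2 remain unchanged.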

What is considerably more complex in the distributed context is
the subject of schedulers. There is no obvious neat way to compile
syntactic restrictions on the global history into distributed algorithms
that achieve them. It therefore appears that distributed history
schedulers must concern themselves with the details of the underlying
model of distributed computation in order to implement the intended
serializability principle; the formidable algorithms of [Th] and [BSRG]
illustrate this point. Nevertheless, it is still natural to conjecture
that the more general ideas related to the classes DSR and Q would
prove advantageous in the distributed environment as well.

7.4 Open Problems

We have proposed a formalism for the concurrency control problem for
databases. There are two aspects of this formalism that may limit its
applicability, and must therefore be modified in a second attempt. One
is our basic assumption, manifested throughout the paper, that the syntactic
description of all transactions to occur in the history is known to the
scheduler a priori. It is not clear how to remove this assumption and
still retain the wealth of available solutions. One way would be to have,
following [BSRG], a certain number of prototype transactions, or classes,
to one of which any arriving transaction can be matched. Another way out
would be to adopt only transaction-driven concurrency controls. Two-phase
locking [EGLT] is an example of such a concurrency control, and so would
be any other locking scheme. The limitations of such approaches are
studied in [KP]. On the other hand, it is possible that variants of the
schedulers presented here could also be implemented in a transaction-driven
manner.

Secondly, our way of evaluating the performance of schedulers is also
in need of improvement. We propose only a qualitative measure of the
performance of a scheduler, namely the set of all output histories. This
leads to only a partial order of schedulers. This was shown to be a
reasonable and useful approximation of reality when the goal is to derive
indicative results or compare general principles of serializability. It is
clear, however, that a more concrete measure of performance is needed for
more practical applications. One promising direction would be to somehow
count the total number of delays imposed on requests: at a first approximation,
the number of transaction steps that cannot execute immediately upon arrival.
This would be a refinement of our measure: our measure, roughly speaking,
assigns a perfect score to all histories that remain the same, and zero
score to all histories that are changed, however small the change. A
more refined measure might even put to test some of our assumptions, like
the "optimistic scheduler" assumption (Section 6): in certain cases it
may be preferable to intervene and modify the history slightly, when
serializable completion becomes extremely unlikely, although not impossible.
Naturally, adopting a more concrete measure of performance for schedulers
will most likely require the introduction of specific and pragmatic details
of the particular application, and the overall approach may have to be
probabilistic.
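
The difference between the two measures can be made concrete with a
small sketch. It assumes one possible reading of "cannot execute
immediately upon arrival", namely that a step counts as delayed when
some step that arrived after it appears before it in the output history;
other definitions are equally defensible. The function names are
hypothetical.

```python
def coarse_score(arrival, output):
    """The measure of this paper: 1 if the history passes unchanged, else 0."""
    return 1 if list(arrival) == list(output) else 0

def delayed_steps(arrival, output):
    """Count steps overtaken in the output by a step that arrived later."""
    out_pos = {step: k for k, step in enumerate(output)}
    count = 0
    for a, step in enumerate(arrival):
        if any(out_pos[later] < out_pos[step] for later in arrival[a + 1:]):
            count += 1
    return count

arrival = ["R1", "R2", "W1", "W2"]
mild = ["R1", "W1", "R2", "W2"]    # only R2 is held back
harsh = ["W1", "W2", "R1", "R2"]   # both reads wait for both writes
```

Both mild and harsh receive a coarse score of zero, yet delayed_steps
gives 1 for the first and 2 for the second, which is the kind of
distinction a refined measure would capture.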

By considering only serializability as our notion of correctness we
have somewhat limited our scope. Examples of concurrency control techniques
more general than serializability can be found in [La1] and [EL]. They
are arrived at by assuming that the scheduler has more than syntactic
information about the transaction system that it handles, e.g., semantic
information or understanding of the integrity constraints. It is pointed
out in [KP] that serializability is just one point in the trade-off
between information and performance of schedulers. However, we feel that
there is something natural about the use of syntactic information for
concurrency control, and that concurrency techniques that go beyond
serializability are of limited practical value.

Finally, we recall two other problems that are left open here:
determining the complexity of recognizing the class SSR, and developing
techniques for designing distributed schedulers from syntactic specifications.